An Integrated Circuit Floating Point Accumulator
Author
Abstract
Similar resources
Efficient Reproducible Floating Point Summation and BLAS
We define reproducibility to mean getting bitwise identical results from multiple runs of the same program, perhaps with different hardware resources or other changes that should ideally not change the answer. Many users depend on reproducibility for debugging or correctness [1]. However, dynamic scheduling of parallel computing resources, combined with nonassociativity of floating point additi...
An FPGA-Specific Approach to Floating-Point Accumulation and Sum-of-Products (LIP Research Report RR2008-22)
This article studies two common situations where the flexibility of FPGAs allows one to design application-specific floating-point operators which are more efficient and more accurate than those offered by processors and GPUs. First, for applications involving the addition of a large number of floating-point values, an ad-hoc accumulator is proposed. By tailoring its parameters to the numerical...
Design-space exploration for the Kulisch accumulator
Floating-point sums and dot products accumulate rounding errors that may render the result very inaccurate. To address this, Kulisch proposed to use an internal accumulator large enough to cover the full exponent range of floating-point. With it, sums and dot products become exact operations. This idea failed to materialize in general purpose processors, as it was considered too slow and/or too ...
Performance Analysis of Floating Point MAC Unit
Real-time DSP applications require a MAC (multiply-accumulate) unit, and the speed of the MAC unit determines the overall performance of the system. A MAC unit basically consists of a multiplier, an adder, and an accumulator. In most cases the floating-point adder/subtractor and multiplier are presented in the IEEE-754 single-precision format. In this research work MAC uni...
Using Delayed Addition Techniques to Accelerate Integer and Floating-Point Calculations in Configurable Hardware
This paper proposes and evaluates an approach for improving the performance of arithmetic calculations via delayed addition. Our approach employs the idea used in Wallace trees to delay addition until the end of a repeated calculation such as accumulation or dot-product; this effectively removes carry propagation overhead from the calculation’s critical path. We present integer and floating-poi...